We study the problem of searching for a fixed path $e_0 e_1 \cdots e_l$ on a network through random walks. We analyze the first hitting time of tracking the path, and obtain an exact expression for the mean first hitting time $\langle T\rangle$. Surprisingly, we find that $\langle T\rangle$ is divided into two distinct parts, $T_1$ and $T_2$. The first part, $T_1 = 2m\prod_{i=1}^{l-1} d(e_i)$, is related to the path itself and is proportional to the degree product. The second part, $T_2$, is related to the network structure. Based on the analytic results, we propose a natural measure for each path, i.e. $\varphi = \prod_{i=1}^{l-1} d(e_i)$, and call it the random walk path measure (RWPM). $\varphi$ essentially determines a path's performance in searching and transporting processes. By minimizing $\varphi$, we also find RW optimal routing, which is a combination of random walk and shortest path routing. RW optimal routing can effectively balance traffic load on nodes and edges across the whole network, and is superior to shortest path routing on any type of complex network. Numerical simulations confirm our analysis.
I. INTRODUCTION



Random walk is a fundamental dynamic process on graphs or networks, and it has long been a research interest in graph theory and related fields. Analysis of random walks has yielded fruitful results about hitting time, cover time, mixing rate and so on [1][2]. Recent studies show that random walks on complex networks can reveal a variety of important features of the underlying network, such as diameter [3], centrality [4][5], entropy [6], community structure [7], topological structure [8], etc. Random walk has also been exploited to tackle diffusion [9], searching [10], routing [11], immunization [12], traffic [13] and communication [14] problems on complex networks. All these research works show that random walk theory is a very useful tool in the study of complex networks. As pointed out by Newman [15], the ultimate goal of the study of complex networks is to discover the laws governing the workings of systems built upon those networks. Therefore, the real power of random walk theory lies in disclosing basic properties of the interaction between dynamical processes on complex networks and the topological structure of these networks. In this letter, we go a step forward along this direction.

Apart from searching for a single node on a network, a random walker can also search for more complex targets. Here we study the problem of searching for a fixed path through random walks on complex networks [16]. This phenomenon can often be observed around us: for example, many of us have had the experience of running into an unexpected path on a journey. The problem can be described as follows. Let $G(V, E)$ be a network with node set $V$ and edge set $E$. There is a path $R$ on the network, $R = e_0 e_1 \cdots e_l$ ($e_i \in V$, $0 \le i \le l$); $R$ is also denoted as $R(e_0 \to e_l)$. A random walker travels on the network, and its $t$-step trace is $W(t) = \sigma_0 \sigma_1 \cdots \sigma_t$ ($\sigma_i \in V$, $0 \le i \le t$); $W(t)$ is also called a $t$-step walk. If the walker passes through path $R$ from its initial node $e_0$ to its terminal node $e_l$ in successive steps, we say that the walker finds path $R$; such an event is expressed as $R \subset W(t)$. Let $T$ be the time at which the walker finds path $R$ for the first time, i.e., $T = \min\{t : R \subset W(t),\ t > 0\}$. By convention, $T$ is still called the first hitting time of the random walk, though here the target is a given path. We will analyze the first hitting time $T$ and give formulas for the probability distribution of $T$, i.e. $f(t) = \mathrm{Prob}\{T = t\}$, and for the mean first hitting time $\langle T\rangle = \sum_{t=0}^{\infty} t f(t)$. Based on the analytic results, we will propose a random walk path measure (RWPM) for each path and show that RWPM plays an important role in searching and transporting processes.
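The definition of $T$ can be illustrated by direct simulation. The following is a minimal sketch, assuming the network is stored as a dictionary of neighbour lists; the function name `first_hitting_time`, the toy network and the chosen path are hypothetical and are not taken from the paper.

```python
import random
from collections import deque

def first_hitting_time(adj, path, source, rng, max_steps=1_000_000):
    """Number of steps until the walk first traverses `path` in successive steps.

    adj maps each node to a list of its neighbours (simple connected graph).
    Returns None if the path is not found within max_steps.
    """
    target = list(path)
    window = deque([source], maxlen=len(target))  # last l+1 visited nodes
    node = source
    for t in range(1, max_steps + 1):
        node = rng.choice(adj[node])   # jump to a uniformly chosen neighbour
        window.append(node)
        if list(window) == target:     # R is a subwalk of W(t) ending at step t
            return t
    return None

# Hypothetical 6-node toy network (an assumption for illustration only).
adj = {0: [1], 1: [0, 2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3], 5: [3]}
rng = random.Random(0)
samples = [first_hitting_time(adj, [1, 2, 4], 0, rng) for _ in range(20000)]
mean_T = sum(samples) / len(samples)   # Monte Carlo estimate of <T>
```

Averaging many independent runs gives an empirical estimate of $\langle T\rangle$, and a histogram of `samples` approximates $f(t)$.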

The outline of this paper is as follows. In Section II, we derive in detail the exact expressions of the first hitting probability $f(t)$ and the mean first hitting time $\langle T\rangle$. Surprisingly, we find that $\langle T\rangle$ is divided into two distinct parts, $T_1$ and $T_2$. The first part $T_1$ is related to the path itself and the second part $T_2$ is related to the network structure. In Section III, we define the random walk path measure (RWPM) based on our analytic results, and discuss its meaning and applications in searching and transporting processes. In Section IV, we give our conclusions.

II. ANALYSIS OF FIRST HITTING TIME 

We first address the first hitting time problem. To simplify the discussion, we consider only simple connected networks, where simple means without loops and multiple edges. Let $|V| = n$ be the number of nodes and $|E| = m$ be the number of edges. The adjacency matrix of the network is $A = (a_{ij})_{n \times n}$, where $a_{ij} = 1$ if node $i$ and node $j$ are linked by an edge, and $a_{ij} = 0$ otherwise. $d(i)$ is the degree of node $i$, i.e. $d(i) = \sum_{j=1}^{n} a_{ij}$. Suppose a random walker starts at source node $s$ and wanders on the network. At each step it jumps with equal probability onto one of its neighboring nodes. The transition probability matrix of the walker can be expressed as $P = DA$, where $D$ is the diagonal matrix $D = \mathrm{diag}(1/d(1), \ldots, 1/d(n))$, so that $P = (p_{ij})_{n \times n}$ with $p_{ij} = a_{ij}/d(i)$. The probability that the walker goes from node $i$ to node $j$ in $t$ steps is $p_{ij}^{(t)} = (P^t)_{ij}$.
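As a concrete illustration of $P = DA$ and of the $t$-step probabilities $(P^t)_{ij}$, here is a short NumPy sketch. The edge list is a hypothetical toy network chosen for illustration, and the helper name `t_step` is an assumption.

```python
import numpy as np

# Hypothetical 6-node simple connected network (not from the paper).
edges = [(0, 1), (1, 2), (1, 3), (2, 4), (3, 4), (3, 5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0        # a_ij = 1 if i and j are linked

d = A.sum(axis=1)                  # degrees d(i)
m = len(edges)                     # number of edges
P = A / d[:, None]                 # P = D A, i.e. p_ij = a_ij / d(i)

def t_step(P, t):
    """t-step transition probabilities, the matrix power P^t."""
    return np.linalg.matrix_power(P, t)

pi = d / (2 * m)                   # steady-state visiting probabilities d(i)/2m
```

Each row of `P` sums to one, and `pi @ P` reproduces `pi`, which is the steady-state distribution $d(i)/2m$ used later in the discussion of routing.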

A path $R$ on the network is a sequence of distinct nodes defined as above, where $e_i e_{i+1}$ ($0 \le i \le l-1$) are edges in $E$. Let $u = e_0$, $v = e_l$; then $R = R(u \to v)$. The length of $R$ is $l$. Without loss of generality, we assume $l > 1$. If the walker finds path $R$, it tracks the nodes of $R$ sequentially, so there is a subwalk in $W(t)$ equal to $R$, i.e., $R = \sigma_i \cdots \sigma_{i+l}$ ($0 \le i$, $i + l \le t$), which is denoted as $R \subset W(t)$. If the first hitting time is $T = \tau$, then the walker completes $R$ at step $\tau$ for the first time, i.e., $R = \sigma_{\tau-l} \cdots \sigma_{\tau}$. In the following, we calculate the first hitting probability $f(t) = \mathrm{Prob}\{T = t\}$. However, it is not easy to compute $f(t)$ directly, and some auxiliary techniques are needed.

Let us consider a conditional probability $\theta(t)$. Without loss of generality, we assume $s \ne u$ in the following discussion. If the walker starting from source node $s$ arrives at $u$, the initial node of $R$, at step $t$, what is the probability that it has found path $R$ within these $t$ steps? This probability can be expressed as

$$\theta(t) = \mathrm{Prob}\{T \le t \mid \sigma_0 = s,\ \sigma_t = u\} = \mathrm{Prob}\{R \subset W_0(t)\} \qquad (1)$$

where $W_0(t)$ is a $t$-step walk starting from $s$ and arriving at $u$ at the last step. From eq. (1) it is easily seen that $\theta(t) = 0$ for $t = 0, \ldots, l+1$. Let $W_0^{+}(t)$ be a $W_0(t)$ that contains $R$, and $W_0^{-}(t)$ be a $W_0(t)$ that does not contain $R$. In order to calculate $\theta(t)$, we divide $W_0^{+}(t)$ into three subwalks:

$$\sigma_0 \cdots \sigma_\tau, \quad e_0 \cdots e_l, \quad \sigma_{\tau+l} \cdots \sigma_t \qquad (2)$$

where $\sigma_\tau = e_0$, $e_l = \sigma_{\tau+l}$ and $\sigma_t = e_0$. The three subwalks in eq. (2) can also be written as $s \cdots u$, $u \cdots v$, $v \cdots u$. The first subwalk starts from source node $s$ and arrives at $u$ at its last step, but it does not contain $R$, so it can be written as $W_0^{-}(\tau)$. Therefore, $\theta(t)$ can be computed iteratively as below:

$$\theta(t) = \mathrm{Prob}\{T \le t \mid \sigma_0 = s,\ \sigma_t = u\} = \mathrm{Prob}\{R \subset W_0(t)\}$$
Let $\mathcal{P}_{ij}(x)$ be the generating function of $\{p_{ij}^{(t)}\}_{t=0}^{\infty}$ and $\Theta(x)$ be the generating function of $\{\theta(t)\}_{t=0}^{\infty}$; then $\mathcal{P}_{ij}(x) = \sum_{t=0}^{\infty} p_{ij}^{(t)} x^t$ and $\Theta(x) = \sum_{t=0}^{\infty} \theta(t) x^t$. Since $0 \le p_{ij}^{(t)}, \theta(t) \le 1$, $\mathcal{P}_{ij}(x)$ and $\Theta(x)$ are convergent for all $x \in (-1, 1)$. By using the expression of the $t$-step transition probability [2], $p_{ij}^{(t)} = d(j)/2m + \sum_{k=2}^{n} \lambda_k^t\, \xi_{ki}\, \xi_{kj} \sqrt{d(j)/d(i)}$, we have



where $\lambda_k$ are the eigenvalues of the matrix $Q = D^{-1/2} P D^{1/2}$, and $\xi_k$ are the corresponding eigenvectors. Consequently, $\Theta(x)$ can be expressed as

therefore we have



Now we are ready to compute $f(t)$. Notice that $T = t$ means that the walker finds path $R$ for the first time at exactly the $t$-th step. We denote such a walk as $W_1(t)$. $W_1(t)$ can be divided into two subwalks, $\sigma_0 \cdots \sigma_{t-l}$ and $e_0 \cdots e_l$, with $\sigma_{t-l} = e_0$; that is, $W_1(t)$ has the form $s \cdots u$, $u \cdots v$. The first subwalk does not contain $R$, so it can be denoted as $W_0^{-}(t-l)$. Note that $f(t) = 0$ for $0 \le t < l$, since the walker cannot find path $R$ in fewer than $l$ steps. For $t \ge l$, $f(t)$ can be computed as below:

By using the relations obtained above, the generating function $\mathcal{F}(x)$ of $\{f(t)\}_{t=0}^{\infty}$ can be derived as below

$f(t)$ can be calculated by taking derivatives of $\mathcal{F}(x)$, i.e., $f(t) = \frac{1}{t!} \left. \frac{d^t \mathcal{F}(x)}{dx^t} \right|_{x=0}$.

Together with eq. (10), the mean first hitting time $\langle T\rangle$ can be calculated as follows:
We have now obtained the main results of this paper, i.e., the probability distribution $f(t)$ and the mean first hitting time $\langle T\rangle$. From eq. (11) we see that the mean first hitting time is divided into two parts, $\langle T\rangle = T_1 + T_2$. The first part is $T_1 = 2m\prod_{i=1}^{l-1} d(e_i)$, which is proportional to the product of the degrees of the nodes on path $R$ except the initial and terminal nodes. The second part $T_2$ is more complicated since it includes information about the topological structure of the network. But if we fix the source node $s$, the initial node $u$ and the terminal node $v$, then the second part $T_2$ is constant for all paths between $u$ and $v$, and is also independent of the path length $l$. This is a somewhat surprising result. Nevertheless, it gives us a standpoint from which to evaluate and compare all paths between $u$ and $v$. In fact, the difference comes only from the first part, or simply from $\prod_{i=1}^{l-1} d(e_i)$, the product of the degrees of the interior nodes on the path. Intuitively, this is a reasonable result.
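The statement that two paths with the same $s$, $u$ and $v$ differ in $\langle T\rangle$ only through $T_1$ can be checked numerically on a small example. The sketch below does not use the generating-function derivation of this section; instead it assumes a standard absorbing Markov chain formulation, in which each state records the current node and how many edges of $R$ have just been matched, and solves a linear system for the expected absorption time. The function name `mean_path_hitting_time` and the toy network are hypothetical.

```python
import numpy as np

def mean_path_hitting_time(adj, path, source):
    """Exact <T> for a simple random walk from `source` to first traverse `path`.

    A state (x, k) means the walker is at node x and its most recent steps
    match e_0 ... e_k; k = -1 means no prefix of the path is being tracked.
    Because the nodes of a path are distinct, a failed match can only fall
    back to k = 0 (the walker stepped onto e_0) or to k = -1.
    """
    l = len(path) - 1
    states = [(x, -1) for x in adj] + [(path[k], k) for k in range(l)]
    index = {state: i for i, state in enumerate(states)}
    Q = np.zeros((len(states), len(states)))    # transient-to-transient part
    for (x, k), row in index.items():
        p = 1.0 / len(adj[x])
        for y in adj[x]:
            if k >= 0 and y == path[k + 1]:
                nk = k + 1                      # one more edge of R matched
            elif y == path[0]:
                nk = 0                          # restart tracking at e_0
            else:
                nk = -1
            if nk < l:                          # nk == l means R is completed
                Q[row, index[(y, nk)]] += p
    h = np.linalg.solve(np.eye(len(states)) - Q, np.ones(len(states)))
    return h[index[(source, 0 if source == path[0] else -1)]]

# Hypothetical toy network with two paths from u = 1 to v = 4.
adj = {0: [1], 1: [0, 2, 3], 2: [1, 4], 3: [1, 4, 5], 4: [2, 3], 5: [3]}
m = sum(len(nbrs) for nbrs in adj.values()) // 2   # m = 6
t_a = mean_path_hitting_time(adj, [1, 2, 4], 0)    # interior degree d(2) = 2
t_b = mean_path_hitting_time(adj, [1, 3, 4], 0)    # interior degree d(3) = 3
gap = t_b - t_a                                    # compare with 2m(3 - 2) = 12
```

On this toy network the two values are 20 and 32 up to rounding, so the gap equals $2m(3 - 2) = 12$, in agreement with a path-independent $T_2$.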



III. RANDOM WALK PATH MEASURE 

Based on the above analysis, we define a function $\varphi(R(e_0 \to e_l)) = \prod_{i=1}^{l-1} d(e_i)$ for path $R$. In comparing all paths between two points $u$ and $v$, $\varphi$ has the same effect as the mean first hitting time $\langle T\rangle$. That is to say, a path with small $\langle T\rangle$ also has small $\varphi$, and vice versa. Therefore, $\varphi$ is a natural measure associated with each path. We call it the random walk path measure (RWPM). In fact, RWPM determines a path's performance in searching and transporting processes. In the following, we use two examples to make this point clear.

The first example shows that paths between a pair of nodes perform differently in searching processes according to their RWPMs. There are four shortest paths between node 70 and node 96 in a small network (Fig. 1(a)); they have the same initial node $u = 70$ and the same terminal node $v = 96$. Details of these paths are listed in TABLE I. $\varphi_4 = 180$ is the smallest and $\varphi_1 = 936$ is the largest. In Fig. 1(b) we see that the distributions of their first hitting probabilities are different: route4 has the largest first hitting probability at all times, while route1 has the smallest. That is to say, it is easy to search for route4 and difficult to search for route1 if a random walker starts from the same source node $s$. Here we see that the searching efficiency of all paths between $u$ and $v$ is determined by their RWPM: the smaller the RWPM of a path, the more easily it will be found. We also point out that shortest paths can be further distinguished by comparing their RWPMs.

FIG. 1: Comparing shortest paths by their first hitting probability. (a) A simple scale-free network with 100 nodes generated by using the BA model [17]. (b) Distribution of the first hitting probability of four shortest paths between a pair of nodes. The information on the nodes of these paths is listed in TABLE I.

TABLE I: Four shortest paths on a small BA network (Fig. 1(a)). For example, route1 is 70-19-4-31-96. The initial node is 70 and the terminal node is 96. The path length is 4. Node degree is shown in brackets. The last column is the $\varphi$ value.
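Ranking candidate paths by RWPM takes only a few lines. The sketch below assumes node degrees are available in a dictionary; the function name `rwpm`, the degree values and the candidate paths are hypothetical and are not the data of TABLE I.

```python
from math import prod

def rwpm(path, degree):
    """phi(R): product of the degrees of the interior nodes e_1 ... e_{l-1}."""
    return prod(degree[node] for node in path[1:-1])

# Hypothetical degrees of a 6-node toy network (not the network of Fig. 1).
degree = {0: 1, 1: 3, 2: 2, 3: 3, 4: 2, 5: 1}
m = sum(degree.values()) // 2                  # number of edges

candidates = [[1, 3, 4], [1, 2, 4]]            # two paths from u = 1 to v = 4
ranked = sorted(candidates, key=lambda p: rwpm(p, degree))
T1 = {tuple(p): 2 * m * rwpm(p, degree) for p in candidates}
```

The first entry of `ranked` is the path expected to be found most easily, and `T1` holds the path-dependent part $2m\varphi$ of the mean first hitting time for each candidate.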

The second example shows that RWPM can be applied in transportation to design more efficient routing strategies. In recent years, the issue of transportation on complex networks has attracted a lot of attention from researchers in this field [18][19][20][21][22][23][24]. One of the most important problems in transportation is to balance load on nodes and edges across the whole network so as to enhance the network's overall transporting capacity. This can be achieved by designing better routing strategies that are able to undo the correlation between traffic load and the structure of the network. Some efforts have been made in this direction. Danila et al. propose an extremal optimization algorithm to balance traffic on a network by minimizing the maximum node betweenness [20]. Scholz et al. suggest smoothing techniques to distribute load evenly across the network [21]. However, their algorithms are not practical for designing routing strategies because of either overwhelming computational cost or unstable output. Theoretically, random walks can accomplish perfect decorrelation between traffic load and network structure, since in the steady state a random walker visits any node $i$ with probability $d(i)/2m$, proportional to the node's degree, and visits any edge with equal probability $1/2m$ [2]. However, a random walk cannot produce stable routing. Hence we propose a new routing strategy that uses the paths with the smallest RWPM, and we call it RW optimal routing or RW optimal path. In other words, among all paths between a pair of nodes $u$ and $v$, RW optimal paths are those that minimize $\varphi$. Such a routing strategy naturally inherits merits from both random walk and shortest path routing. On one hand, an RW optimal path has the minimum $\varphi$ value, so it cannot be too long. In fact, RW optimal paths are only slightly longer than shortest paths on average; for example, in the BA network illustrated in Fig. 2 the average path length is 3.2 for shortest paths and 3.4 for RW optimal paths. On the other hand, the minimum degree product also ensures that RW optimal paths are the least likely to be interfered with from outside, since they have the fewest branches linking to other parts of the network. Therefore, among all paths between a pair of nodes, the RW optimal path is the least likely to become congested.
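The text does not specify how RW optimal paths are computed. One possible approach, offered here only as an assumption, is a Dijkstra-style search in which the cost of a partial path is the product of the degrees of its interior nodes. Since every degree is at least 1, extending a path never lowers its cost, which is what label-setting search requires. The function name `rw_optimal_path`, the tie-break on hop count and the toy network are hypothetical.

```python
import heapq

def rw_optimal_path(adj, u, v):
    """Return (phi, path) for a u-v path minimising the interior degree product.

    adj maps each node to a list of neighbours; the network is assumed to be
    connected and node labels are assumed to be mutually comparable.
    Ties in phi are broken by the smaller hop count (an illustrative choice).
    """
    best = {u: (1, 0)}                 # node -> (degree product so far, hops)
    prev = {u: None}
    heap = [(1, 0, u)]
    while heap:
        cost, hops, x = heapq.heappop(heap)
        if (cost, hops) > best[x]:
            continue                   # stale heap entry
        if x == v:
            break
        for y in adj[x]:
            # The terminal node v does not contribute to the product.
            new = (cost if y == v else cost * len(adj[y]), hops + 1)
            if y not in best or new < best[y]:
                best[y] = new
                prev[y] = x
                heapq.heappush(heap, (new[0], new[1], y))
    path = [v]
    while path[-1] != u:
        path.append(prev[path[-1]])
    return best[v][0], path[::-1]

# Hypothetical network: node 1 is a hub lying on the shortest route from 0 to 2.
adj = {0: [1, 3], 1: [0, 2, 5, 6, 7], 2: [1, 4], 3: [0, 4],
       4: [3, 2], 5: [1], 6: [1], 7: [1]}
phi, path = rw_optimal_path(adj, 0, 2)   # avoids the hub: phi = 4 via 0-3-4-2
```

In this toy network the shortest path 0-1-2 has $\varphi = 5$ because it crosses the hub, whereas the returned path is one hop longer with $\varphi = 4$. This mirrors the observation that RW optimal paths are only slightly longer than shortest paths while avoiding high-degree nodes.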



Fig. 2 shows simulation results on a BA scale-free network. Fig. 2(a) is the distribution of node betweenness and Fig. 2(b) is the distribution of edge betweenness. The betweenness centrality is calculated by the method introduced in [25]. In Fig. 2(a) we see that node betweenness has the form $B(k) \sim k^{\beta}$ [20], with $\beta \approx 0.95$ for RW optimal routing and $\beta \approx 1.82$ for shortest path routing. The node betweenness for RW optimal routing is close to a linear function of degree $k$. We also note that the largest node betweenness is 64138 for RW optimal routing and 295948 for shortest path routing; the former is only about one fifth of the latter. This is the first sign that RW optimal routing undoes the correlation between traffic load and network structure [21]. In Fig. 2(b), edge betweenness is distributed much more narrowly for RW optimal routing than for shortest path routing. At the same time, the largest edge betweenness is much smaller for RW optimal routing than for shortest path routing: it is 2431 for RW optimal routing and 13303 for shortest path routing. This is the second sign that RW optimal routing undoes the correlation between traffic load and network structure [21]. All these results show that traffic load is distributed more evenly by RW optimal routing than by shortest path routing. We also ran simulations on other types of networks, such as random graphs, scale-free networks with tunable clustering [27], Internet-like networks [28], and so on, and obtained similar results. In all these cases, RW optimal routing is superior to shortest path routing.

FIG. 2: Node betweenness and edge betweenness for shortest path routing (hop) and RW optimal routing (product) on a BA scale-free network with 2000 nodes. (a) Node betweenness $B(k)$ vs node degree $k$. Node betweenness is averaged according to node degree. (b) Frequency of edge betweenness.
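To compare routing strategies in the way described above, one needs the load that a given routing places on each node. The sketch below is a simplified stand-in for the betweenness computation of [25], under two assumptions: every ordered pair of nodes uses exactly one path, and only interior nodes of a path are counted. The function names `bfs_route` and `node_load` and the three-node example are hypothetical.

```python
from collections import deque
from itertools import permutations

def bfs_route(adj, u, v):
    """One shortest (fewest-hop) path from u to v in a connected network."""
    prev = {u: None}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            break
        for y in adj[x]:
            if y not in prev:
                prev[y] = x
                queue.append(y)
    path = [v]
    while path[-1] != u:
        path.append(prev[path[-1]])
    return path[::-1]

def node_load(adj, route):
    """Count, for every node, how many routed paths pass through its interior.

    `route(adj, u, v)` must return a list of nodes from u to v; any routing
    rule with this signature can be plugged in.
    """
    load = {x: 0 for x in adj}
    for u, v in permutations(adj, 2):          # all ordered source-target pairs
        for x in route(adj, u, v)[1:-1]:
            load[x] += 1
    return load

# Hypothetical 3-node chain: only the middle node carries transit traffic.
chain = {0: [1], 1: [0, 2], 2: [1]}
load = node_load(chain, bfs_route)
```

Passing a router that minimizes $\varphi$ instead of `bfs_route` gives the corresponding load for RW optimal routing, and the maximum of `load.values()` can then be compared between the two strategies.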



IV. CONCLUSIONS 

In summary, we have studied the problem of searching for a path on complex networks through random walks and derived formulas for the distribution of the first hitting time $f(t)$ and the mean first hitting time $\langle T\rangle$. We find that $\langle T\rangle$ is divided into two distinct parts with totally different features: the first part $T_1$ is proportional to the product of the degrees of the nodes on the path, i.e., $T_1 = 2m\prod_{i=1}^{l-1} d(e_i)$, and the second part $T_2$ is determined by the topological structure of the network. Based on the analytic results, we define a function $\varphi$ for each path $R$, $\varphi(R(e_0 \to e_l)) = \prod_{i=1}^{l-1} d(e_i)$, called the random walk path measure (RWPM). Since it is derived from the mean first hitting time, $\varphi$ is a natural measure associated with each path. We have shown that $\varphi$ essentially determines a path's performance in searching and transporting processes. By minimizing $\varphi$ we can also find RW optimal routing. RW optimal routing inherits merits from both random walk and shortest path routing, and it is superior to shortest path routing in all cases considered. Numerical simulations confirm our conclusion.



[1] B. Bollobás, Modern Graph Theory, Springer, New York, 1998.
[2] L. Lovász, Random walks on graphs: a survey, Combinatorics, Paul Erdős is Eighty, Keszthely (Hungary) 2 (1993) 1-46.
[3] S. Lee, S.-H. Yook, Y. Kim, Random walks and diameter of finite scale-free networks, Physica A 387 (2008) 3033-3038.
[4] M. E. J. Newman, A measure of betweenness centrality based on random walks, Social Networks 27 (2005) 39-54.
[5] R. Guimerà, A. Díaz-Guilera, Optimal network topologies for local search with congestion, Phys. Rev. Lett. 89 (2002) 248701.
[6] M. Rosvall, A. Trusina, P. Minnhagen, K. Sneppen, Networks and cities: An information perspective, Phys. Rev. Lett. 94 (2005) 028701.
[7] H. Zhou, Network landscape from a Brownian particle's perspective, Phys. Rev. E 67 (2003) 041908.
[8] S. Yoon, S. Lee, S. Yook, Y. Kim, Statistical properties of sampled networks by random walks, Phys. Rev. E 75 (2007) 046114.
[9] S. Lee, S.-H. Yook, Y. Kim, Diffusive capture process on complex networks, Phys. Rev. E 74 (2006) 046118.
[10] N. Bisnik, A. A. Abouzeid, Optimizing random walk search algorithms in P2P networks, Computer Networks 51 (2007) 1499-1514.
[11] H. Tian, H. Shen, Random walk routing in WSNs with regular topologies, Journal of Computer Science and Technology 21 (4) (2006) 496-502.
[12] H. Ke, T. Yi, Immunization for scale-free networks by random walker, Chinese Physics 15 (12) (2006).
[13] R. Germano, A. P. S. de Moura, Traffic of particles in complex networks, Phys. Rev. E 74 (2006) 036117.
[14] J. D. Noh, H. Rieger, Random walks on complex networks, Phys. Rev. Lett. 92 (2004) 118701.
[15] M. E. J. Newman, The structure and function of complex networks, SIAM Review 45 (2) (2003) 167-256.
[16] S.-P. Wang, W.-J. Pei, Detecting unknown paths on complex networks through random walks, Physica A 388 (2009) 514-522.
[17] A.-L. Barabási, R. Albert, Emergence of scaling in random networks, Science 286 (1999) 509-512.
[18] Z. Chen, X. Wang, Effects of network structure and routing strategy on network capacity, Phys. Rev. E 73 (2006) 036107.
[19] P. Echenique, J. Gómez-Gardeñes, Y. Moreno, Improved routing strategies for internet traffic delivery, Phys. Rev. E 70 (2004) 056105.
[20] B. Danila, Y. Yu, J. A. Marsh, K. E. Bassler, Optimal transport on complex networks, Phys. Rev. E 74 (2006) 046106.
[21] J. Scholz, W. Krause, M. Greiner, Decorrelation of networked communication flow via load-dependent routing weights, Physica A 387 (2008) 2987-3000.
[22] W.-X. Wang, B.-H. Wang, Traffic dynamics based on local routing protocol on a scale-free network, Phys. Rev. E 73 (2006) 026111.
[23] W.-X. Wang, C.-Y. Yin, G. Yan, B.-H. Wang, Integrating local static and dynamic information for routing traffic, Phys. Rev. E 74 (2006) 016101.
[24] B. Tadić, Guided search and distribution of information flow on complex graphs, Lecture Notes in Computer Science 3038 (2004) 1086-1093.
[25] G. Yan, T. Zhou, et al., Efficient routing on complex networks, Phys. Rev. E 73 (2006) 046108.
[26] K. I. Goh, B. Kahng, D. Kim, Universal behavior of load distribution in scale-free networks, Phys. Rev. Lett. 87 (2001) 278701.
[27] P. Holme, B. J. Kim, Growing scale-free networks with tunable clustering, Phys. Rev. E 65 (2002) 026107.
[28] S. Zhou, R. J. Mondragon, Accurately modeling the internet topology, Phys. Rev. E 70 (2004) 066108.



